Gradient descent tries to find the lowest point in a fitness landscape by following the direction with the fastest drop in value. It is equivalent to gradient ascent, but with the fitness function inverted. Simulated annealing is a form of probabilistic gradient descent in which the gradient determines the probability of following a direction, rather than always taking the very fastest route downwards. Backpropagation can also be viewed as a form of gradient descent, minimising the difference between the actual and expected outputs of a neural network, performed incrementally for each training example in turn.
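The core loop can be sketched as follows; this is a minimal illustration, where the fitness function f(x) = (x − 3)², the step size, and the iteration count are all assumptions chosen for the example, not taken from the text.

```python
def grad(x):
    # Derivative of the illustrative fitness function f(x) = (x - 3)^2.
    return 2 * (x - 3)

def gradient_descent(x, step=0.1, iterations=100):
    # Repeatedly move against the gradient, i.e. in the direction
    # with the fastest drop in the fitness value.
    for _ in range(iterations):
        x -= step * grad(x)
    return x

# Starting from x = 0, the search converges toward the minimum at x = 3.
print(gradient_descent(0.0))
```

Gradient ascent would simply flip the sign of the update, maximising the inverted function instead.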
Used in Chap. 7: page 105; Chap. 9: pages 132, 133, 134; Chap. 12: page 198